Zooming in on NYC taxi data with Portal

Authors

  • Julia Stoyanovich
  • Matthew Gilbride
  • Vera Zaychik Moffitt
Abstract

In this paper we develop a methodology for analyzing transportation data at different levels of temporal and geographic granularity, and apply our methodology to the TLC Trip Record Dataset, made publicly available by the NYC Taxi & Limousine Commission. This data is naturally represented by a set of trajectories, annotated with time and with additional information such as passenger count and cost. We analyze TLC data to identify hotspots, which point to lack of convenient public transportation options, and popular routes, which motivate ride-sharing solutions or addition of a bus route. Our methodology is based on using a system called Portal, which implements efficient representations and principled analysis methods for evolving graphs. Portal is implemented on top of Apache Spark, a popular distributed data processing system, is inter-operable with other Spark libraries like SparkSQL, and supports sophisticated kinds of analysis of evolving graphs efficiently. Portal is currently under development in the Data, Responsibly Lab at Drexel. We plan to release Portal as open source in Fall 2017.

Similar resources

The Impact of Fuel Efficiency Improvement on Driving Behaviors of NYC Taxi Drivers

Fuel efficiency has improved because of environmental policies and high gas prices. In most cases, increased vehicle usage is associated with negative outcomes because of potentially increasing emissions. In the New York City (NYC) taxi industry, however, increased vehicle usage corresponds to increased supply, which is meaningful because of the limited number of permissions and the fixed fare...

Characterizing Taxi Flows in New York City

We present an analysis of taxi flows in Manhattan (NYC) using a variety of data mining approaches. The methods presented here can aid in development of representative and accurate models of large-scale traffic flows with applications to many areas, including outlier detection and characterization.

Improving Viability of Electric Taxis by Taxi Service Strategy Optimization: A Big Data Analysis of New York City

Electrification of transportation is critical for a low-carbon society. In particular, public vehicles (e.g., taxis) provide a crucial opportunity for electrification. Despite the benefits of eco-friendliness and energy efficiency, adoption of electric taxis faces several obstacles, including constrained driving range, long recharging duration, limited charging stations and low gas price, all of...

Tuning Shape Parameter of Radial Basis Functions in Zooming Images using Genetic Algorithm

Image zooming is one of the current issues of image processing where maintaining the quality and structure of the zoomed image is important. To zoom an image, it is necessary that the extra pixels be placed in the data of the image. Adding the data to the image must be consistent with the texture in the image and must not create artificial blocks. In this study, the required pixels are estimated ...

Exploring Traffic Dynamics in Urban Environments Using Vector-Valued Functions

The traffic infrastructure greatly impacts the quality of life in urban environments. To optimize this infrastructure, engineers and decision makers need to explore traffic data. In doing so, they face two important challenges: the sparseness of speed sensors that cover only a limited number of road segments, and the complexity of traffic patterns they need to analyze. In this paper we take a f...

Journal title:
  • CoRR

Volume abs/1709.06176

Publication date 2017